Estimating the effect of cesarean delivery on long-term childhood health across two countries

Assessing the impact of cesarean delivery (CD) on long-term childhood outcomes is challenging, as conducting a randomized controlled trial is rarely feasible and inference from observational data may be confounded. Utilizing data from electronic health records of 737,904 births, we defined and emulated a target trial to estimate the effect of CD on predefined long-term pediatric outcomes. Causal effects were estimated using pooled logistic regression and standardized survival curves, leveraging data breadth to account for potential confounders. Diverse sensitivity analyses were performed, including replication of results in an external validation set from the UK including 625,044 births. Children born by CD had an increased risk of developing asthma (10-year risk difference (95% CI) 0.64% (0.31, 0.98)). The average treatment effect of CD was 0.10 (0.07–0.12) on body mass index (BMI) z-scores at 5 years of age and 0.92 (0.68–1.14) on the number of respiratory infection events until 5 years of age. A positive 10-year risk difference was also observed for atopy (10-year risk difference (95% CI) 0.74% (-0.06, 1.52)) and allergy (0.47% (-0.32, 1.28)). Increased risk for these outcomes was also observed in the UK cohort. Our findings add to a growing body of evidence on the long-term effects of CD on pediatric morbidity, may assist in the decision to perform CD when not medically indicated, and pave the way to future research on the mechanisms underlying these effects and intervention strategies targeting them.


Introduction
Medically indicated cesarean delivery (CD) is a lifesaving procedure for both mother and newborn, although it bears risk for short and long-term pediatric adverse health outcomes [1]. In the past decades, the incidence of CD increased annually by 4%, and is highly variable geographically, ranging from less than 5% of deliveries in southern Africa to almost 60% in some parts of Latin America [2]. This rise, which also reflects an increase in CDs without a medical or obstetric indication [3], is the result of cultural, personal and medico-legal reasons, with limited consideration to the impact that mode of delivery may have on long-term pediatric health. Previous studies have highlighted childhood obesity [4]; atopy, asthma, allergies, atopic dermatitis [5]; attention deficit hyperactivity disorder (ADHD), autistic spectrum disorder (ASD) [6]; autoimmune diseases [7] such as type 1 diabetes [8] as possible long-term outcomes of children born by CD. However, many of the studies were based on relatively small cohorts from a single geographic location, had a short follow-up period, did not account for possible confounders that might affect these associations and yielded diverse results in regard to the magnitude of the effect [1].
While the optimal incidence of CD is debatable [9][10][11], a better understanding of how delivery mode affects long-term health outcomes in children may influence health policies and decision processes of both clinicians and women. It could also serve as a basis for further research investigating the underlying biological mechanisms of these effects and possible interventions to improve pediatric outcomes of those born by CD. As conducting randomized controlled trials (RCTs) of delivery mode is rarely feasible and may be viewed as unethical, as it may increase the risk for adverse maternal and infant outcomes, we utilized state-of-the-art causal inference methods on high quality, high volume, longitudinal observational data that originates from Israel's largest healthcare provider, to assess long-term adverse outcomes of children born by CD.
Following an appropriate study design and methodologies tailored to the use of observational data for causal analysis has shown promise to overcome some of the issues that arise when working with such data [12]. One such study design is the Target Trial [13] framework. In this method, the observational analysis is designed to explicitly emulate an RCT, including definitions for trial eligibility and treatment assignment with time zero of follow-up. By emulating an RCT, flaws such as selection bias and immortal time bias can be avoided [14]. Effect estimates from observational data obtained using this framework were comparable to those obtained from RCTs [12].
In accordance with this framework, we envisioned a hypothetical RCT which could estimate the causal effect of CD on pediatric outcomes, and utilized observational data to emulate it as closely as possible. Table 1 summarizes key components of this RCT versus their corresponding definition in our observational data. Time-zero (which can be viewed as target trial initiation), the eligibility-determining date in which treatment strategies are assigned and follow-up starts, was defined as the time of birth. Survival curves were estimated by fitting a pooled logistic regression model [15,16], on a person-time data format. To adjust for baseline confounders we either added baseline selected variables to the logistic model (standardization) [17], or performed weighting using either IPW [18] or OW [19] (see Methods).

Results
Overall, 238,159 eligible births, 84.35% vaginal and 15.65% cesarean deliveries, were included in the Israel cohort (Table 2), and 163,272 eligible births were included in the UK cohort (Table S2.1 in S1 Appendix). Children had a mean follow-up of 5.66 (SD 2.96) years and 6.77 (SD 4.54) years in the Israel and UK cohorts, respectively. In both cohorts, the risk for atopy, asthma and allergy was higher in children born by CD compared to vaginal delivery. For these outcomes, the 10-year standardized risk differences observed were 0.74% (-0.06, 1.52) for atopy, 0.64% (0.31, 0.98) for asthma and 0.47% (-0.32, 1.28) for allergy in the Israel cohort, and 0.41% (-0.42, 1.15), 1.04% (0.49, 1.71) and 0.23% (0.07, 0.36), respectively, in the UK cohort (Table 3, Figs 2 and 3). A small risk difference was also found in the Israel cohort for ADHD, 0.34% (-0.16, 0.82), and for ASD, 0.24% (0.02, 0.46). In the UK cohort, however, no difference was found for ASD, and a lower 10-year risk was found for ADHD among children born through CD, although this might result from the very low incidence rate of ADHD in the UK cohort. P-values for the 10-year risk survival differences were obtained for these outcomes [20] in the full cohort, and corrected for multiple hypotheses [21] for each analysis method separately (Table S6.10 in S1 Appendix). The three outcomes with higher risk in children born by CD (atopy, asthma and allergy) had significant (<0.05) p-values after correction in at least one analysis method and one cohort (Table S6.10 in S1 Appendix). The 10-year risk difference for asthma was significant in both the Israel and the UK cohorts. An average effect of 0.10 (0.07, 0.12) and 0.92 (0.68, 1.14) was found for BMI z-score at age 5-6 years old and for the number of respiratory infection incidents until 5 years of age, respectively, in the Israel cohort (Table 4). Of note, in all of these outcomes, a similar trend was observed in all estimation strategies: IPW, OW, and standardization (see Methods), with the exception of IPW on the allergy outcome in the Israeli cohort (Tables S6.1, S6.3 in S1 Appendix). Additional predefined long-term pediatric outcomes, including death, type 1 diabetes mellitus, celiac, and autoimmune diseases, had a negligible risk difference in both the Israel and UK cohorts. Forearm fracture, which was used as a negative control (see Methods), also had a negligible risk difference, further validating our findings.
Eligibility criteria footnotes: 3. We excluded women with more than one previous CD in their medical history data. 4. We define preterm births by the premature_flag available from hospital data, or gestational age at birth below 37 weeks. 5. We exclude children with birth weight less than 2500 grams. 6. We exclude multiple gestations. 7. We require information on mother-child linkages, delivery date, mode of delivery and neonatal sex.
As additional stringent analyses, we next studied the effect of CD in specific subpopulations as follows:

Elective CD subpopulation
While some studies have shown a similar association between elective and unscheduled CD and pediatric outcomes [22,23], others have shown different correlations depending on the type of CD performed [24,25], possibly due to the fact that unscheduled CDs could result from an emergency during labor, and therefore additional factors might affect the newborn's health. We therefore created a subgroup of children born by elective CD for both cohorts. However, while in the UK cohort, information on the type of CD was available, in the Israeli cohort it was not, and was therefore estimated by other parameters (see Methods). Overall, 217,258 children from the Israel cohort and 141,222 children from the UK cohort were included in these analyses (Tables S6.5, S6.6 in S1 Appendix). In both subgroups, standardized risk differences were still observed for asthma 0.37% (-0.09, 0.92) in the Israel cohort and 0.96% (0.05, 2.33) in the UK cohort respectively. The results for the other outcomes were not consistent in both subgroups (Table 3). In the Israeli cohort, the average treatment effect for BMI z-score at age 5-6 years old and for the number of respiratory infection incidents until 5 years of age was consistent in this subgroup with the findings in the full cohort (Table S6.3 in S1 Appendix).

Clinics matched subpopulation
Potential confounding factors such as environment, socioeconomic status and standard of care may have a substantial impact on the validity of the results. To investigate the sensitivity of our estimates to these factors, we analyzed a subpopulation that included 66,464 children born by either CD or vaginal deliveries grouped by maternal clinics and year of birth (see Methods, Table S6.7 in S1 Appendix). In this subgroup, similarly to the full Israel cohort, a small risk difference of 0.44% (0.08, 0.88) was observed for ASD. Standardized risk differences for atopy, asthma and allergy were 0.93% (-0.36, 2.02), 0.31% (-0.14, 0.82) and 0.23% (-1.15, 1.47) respectively, and an average treatment effect of 0.09 (0.07, 0.12) and 0.96 (0.76, 1.17) was observed for BMI z-score at age 5-6 years old and for the number of respiratory infection incidents until 5 years of age. Some of these risks are smaller than those seen in the full Israel population, and they were not consistent across all analysis methods (Table S6.2, Fig S6.1 in S1 Appendix), possibly due to the relatively small sample size of this group.

Siblings matched subpopulation
It has been previously shown that sibling studies might deal well with potential unobserved confounders related to environmental and genetic factors [26]. We analyzed a subpopulation of discordant siblings (see Methods, Table S6.8 in S1 Appendix). In this subgroup, we observed risk differences of 0.4% (-4.43, 5.92) for atopy, 1.7% (-0.37, 3.84) for asthma and 1.85% (-4.62, 8.32) for allergy, which also had positive 10-year risk differences across all analysis methods, apart from IPW on asthma (Table S6.2 in S1 Appendix, Fig S6.1 in S1 Appendix). The 95% CIs from OW on asthma include zero (Table S6.2 in S1 Appendix), and other results show wider CIs than in the main analysis, but this might stem from the very small size of this subpopulation. Overall, the same trend was observed for these outcomes in this subpopulation. Consistent with the full cohort, we observed an average treatment effect of 0.10 (0.07, 0.12) and 1.02 (0.81, 1.25) for BMI z-score at age 5-6 years old and for the number of respiratory infection incidents until 5 years of age, respectively.

Subpopulation of women with no history of a previous CD
For women with a history of one previous CD, clinicians recommend a trial of labor after cesarean delivery (TOLAC). Yet, although TOLAC is appropriate for many women, several factors increase the likelihood of a failed trial of labor. We analyzed a subpopulation of 220,041 children whose mothers did not have a history of CDs (Table S6.9 in S1 Appendix). In this subgroup, risk differences for atopy, asthma and allergy were 0.89% (0.15, 1.8), 0.82% (0.48, 1.15) and 0.24% (-0.88, 0.94), respectively. An average effect of 0.09 (0.07, 0.12) and 0.92 (0.70, 1.16) was found for BMI z-score at age 5-6 years old and for the number of respiratory infection incidents until 5 years of age, respectively, in this subpopulation. These, along with results for other outcomes, were consistent with the results observed in the full cohort (Tables S6.2, S6.4 in S1 Appendix). Standardized survival curves for all outcomes in the Israel and UK cohorts can be seen in Fig S6.2 in S1 Appendix. Unadjusted Kaplan-Meier curves for all outcomes and full results that include OW and IPW, as well as all sensitivity analyses, can be found in Sections 4 and 6 of the S1 Appendix.

Discussion
In this study, we utilized data from the largest HMO in Israel in order to assess long-term pediatric adverse outcomes of CD, compared to vaginal delivery. While evidence regarding the effect of birth mode on future long-term outcomes of children is accumulating [1,28], conducting an RCT to assess these effects is rarely feasible. Here, we adopted the target trial framework which relies on counterfactual reasoning [13] to analyse the effect of delivery mode on predefined outcomes among 238,159 children for a mean follow-up period of 5.66 (SD 2.96) years in Israel, and 163,272 children with a mean follow-up of 6.77 (SD 4.54) years in the UK. We revealed that CD had an effect on the occurrence of asthma, and may also have an effect on atopy, allergy and the number of respiratory infection events by the age of 5 years old both in the Israel and UK cohorts. An effect of CD on BMI z-scores at the age of 5 years old was also observed in the Israel cohort, but this result could not be validated in the UK cohort since routine anthropometric measurements for children were not available. Although results were not replicated for all outcomes in the different subgroups analysed, it may be a result of insufficient power to detect these differences in these small-scale cohorts, as similar trends were often observed.
The effect of mode of delivery on pediatric outcomes is hypothesized to be mediated by several mechanisms [29]. These include: hormonal surges and exposure to different levels of physical stress during labor, which may trigger protective developmental processes in the newborns and play a role in normal postnatal physiological development [30]; perturbations in the transmission of maternal microbiome to the infant during CD, which may result in different establishment and diversity of the microbiota, thus possibly affecting future childhood health [31,32]; changes in the regulation of gene expression as a result of alterations in epigenetic patterns such as DNA-methylation [33]; abnormal short-term immune responses observed in infants born by CD, such as reduced expression of inflammatory markers [7,34]; and exposure to general anesthesia and anesthetic medications that may cross the placental barrier [35]. Among all pediatric outcomes analysed, CD had the largest effect on atopy and asthma development (0.74% and 0.64%, respectively, in the Israel cohort; 0.41% and 1.04%, respectively, in the UK cohort). Risk differences for asthma were also observed in several sensitivity analysis subpopulations: 0.37% (-0.09, 0.92) in the estimated elective CD subpopulation, 0.96% (0.05, 2.33) in the UK elective CD subpopulation, 0.31% (-0.14, 0.82) in the clinics matched subpopulation and 1.7% (-0.37, 3.84) in the siblings matched subpopulation. Notably, although the magnitude of these risk differences is relatively small, the relative risk of asthma diagnosis by 10 years of age increased by almost 10% in children born by CD in the Israeli cohort and by 8.91% in the UK cohort. The association between CD and the development of asthma was previously demonstrated in numerous studies. A meta-analysis found an increase of 20% in the risk of asthma in children who were delivered by CD [36,37].
A possible mechanism for this association was recently demonstrated in a study on the gut microbiome of children born by cesarean delivery showing an increased asthma risk only in children in whom microbiome composition at 1 year of age still retained a CD microbial signature, suggesting a role of altered maturation of the gut microbiota in the increased risk observed [32]. Another hypothesis underlying this association is an altered pulmonary physiology as a result of a delayed removal of amniotic fluid from the lung of infants that are born by CD, resulting in transient tachypnea of the newborn and higher incidence of respiratory distress syndrome after birth [38], which may increase the risk for asthma in the future. Interestingly, we found an average difference of 0.92 and 0.12 for the number of incidents of respiratory infections up to the age of 5 years old, in Israel and the UK respectively. This was previously demonstrated [39], and may partially mediate the effect observed on asthma, as viral infections are important causes of wheezing illnesses in children of all age ranges [40] and evidence demonstrating a link between early viral infections and asthma inception and exacerbations is accumulating [41].
The effect of CD on the incidence of atopy in children in our study is mostly driven by its effects on the development of asthma and allergy, each affected by CD when analysed separately. In the Israel cohort, the effect on atopy is larger, partially because in this cohort atopic dermatitis is also positively affected by CD, in contrast to the UK cohort (Table 3). As expected, there is an overlap between these diagnoses (as presented for the Israel cohort in Fig S4.11 in S1 Appendix). Several previous studies did not find an association between CD and atopy [5,28] or atopic dermatitis [42,43]. However, meta-analyses analysing this effect display significant heterogeneity between studies [5,28], with relatively short follow-up time (1-3 years) in some of the studies [42,43].
A smaller risk difference was found for ADHD, 0.34% (-0.16, 0.82), and ASD, 0.24% (0.02, 0.46), between infants born by CD and those born by vaginal delivery in the Israeli cohort. These differences were smaller in the estimated elective CD subpopulation, 0.2% (-0.6, 0.92) and 0.19% (-0.05, 0.47), and even smaller in the full UK cohort, -0.17% (-0.33, 0.03) and 0.03% (-0.27, 0.39), for ADHD and ASD respectively. Other subpopulations and methods used as sensitivity analyses resulted in wide interval estimates that included zero for both outcomes. Previous studies reached conflicting conclusions regarding the effect of CD on the development of ADHD and ASD. While some studies found that children born by CD are roughly 20% more likely to be diagnosed with ASD [6], others concluded that no association exists [44]. A previous study highlighted a possible association between general anesthesia during CD and ASD [45]. Lack of information on social parameters and variable diagnosis procedures of ADHD and ASD in different health systems might explain the variability in our results.
Finally, we found that CD led to an average difference of 0.10 in BMI z-scores at age 5 years old. Very similar differences were found in all sensitivity analyses performed. The association of CD with subsequent obesity of the offspring has been extensively studied, with mixed conclusions regarding the existence and magnitude of the effect. While some studies did not find any effect of delivery mode on childhood overweight [46], two meta-analyses of the literature concluded that CD is associated with an increased risk of subsequent obesity in offspring [47,48]. It has been hypothesized that, other than the above-mentioned mechanisms, the difference may also arise from altered levels of appetite regulation hormones, as evident from a lower concentration of circulating ghrelin [49] and a lower umbilical leptin concentration [50] in infants born by CD.
Our study has several strengths. First, we analyzed a large and comprehensive nationwide dataset including both maternal and offspring data, with a long follow-up period. Second, we replicated our estimates using the same computational methods on an independent cohort that differs in both genetic and environmental factors, providing further validation of our findings. Finally, while the relation between CD and several long-term childhood outcomes has been previously investigated in many observational studies, most have concentrated on associations rather than explicitly pursuing a causal estimate.
However, our study also has several limitations. Our dataset does not contain information on some potential confounding factors such as maternal nutritional and environmental exposures. In order to minimize the effect of these confounders on our results we performed several sensitivity analyses. Another limitation is that our data did not include explicit information on whether the CD was a result of an emergency during labor versus an elective procedure, a differentiation that might be important. Although some studies found a similar association of the two CD types with pediatric outcomes [22,23], others did not [51]. To try to overcome this we analyzed a subpopulation of CDs which were classified as elective with a high probability. In addition, we were able to analyze the subpopulation of elective CD in the UK cohort. Our data also lack information on other potential confounding factors related to labor, such as information on mode of anesthesia and usage of anesthetic medications during surgery. Finally, our results are only valid for term, appropriate for gestational age infants.
In conclusion, by emulating a target trial on two cohorts from different countries, we found a small causal effect of CD on pediatric asthma and childhood BMI, and a possible effect on several other pediatric health outcomes, including atopy, allergy and respiratory infections. For other outcomes, such as ASD and autoimmune diseases, an increased risk was not observed consistently when employing different methods and across both countries. Our findings might contribute to the ongoing discussion on the optimal rate of CD, with an emphasis on its adverse effect on long-term pediatric health, and may enhance discussions between clinicians and parents regarding these risks. In addition, they may pave the way to future research on the mechanisms underlying these effects and possible intervention strategies targeting them.

Ethics declarations
The study protocol in this research was approved by the Institutional Review Board (IRB) of the Rabin Medical Center, experiment protocol number 0158-19-RMC. Informed consent was waived by the IRB, as this is a retrospective study based on unidentified data taken from EHRs.

Use of IQVIA Medical Research Data (IMRD) is approved by the NHS London-South East Research Ethics Committee (REC reference: 18/LO/0441); in accordance with this approval, the study protocol was reviewed and approved by an independent Scientific Review Committee (SRC) of IQVIA Inc. (reference number: 20SRC018).

Data
Data for the main cohort (the Israel cohort) were extracted from the database of Clalit Health Services (Clalit), the largest health maintenance organization (HMO) in Israel [52]. Clalit is a nongovernmental, nonprofit organization with an electronic health record (EHR) database of more than 5 million patients, representing over 50% of Israel's adult population (Section 1a in the S1 Appendix). The data include anthropometric measurements, blood pressure measurements, laboratory test results, diagnoses recorded by physicians, dispensed pharmaceuticals and family linkage.

Replication data: UK cohort
Data for the replication cohort (the UK cohort) were extracted from primary care electronic health records in IQVIA Medical Research Data (IMRD), incorporating data from The Health Improvement Network (THIN, a Cegedim database). This database contains records of more than 12.5 million patients, covering approximately 6% of the UK population, and is representative of the population in terms of demographics and condition prevalence [53]. The data include patient demographics, medical diagnoses, medication prescriptions, anthropometric measurements and laboratory test results, which were transformed to the OMOP common data model [54].

Study population
We analyzed a total of 737,904 births, from 2002 until 2018. CD rates in Israel were relatively stable (roughly 17%) with a small linear decreasing trend during this time period (Fig S2.1 in S1 Appendix). The cohort included offspring of women across the whole spectrum of social deprivation in Israel (see S1 Appendix). Eligible birth records were required to contain information which we defined as critical for our analysis, including mother-child linkage and at least 5 years of documented maternal medical history in Clalit's EHR data prior to delivery. Additional exclusion criteria were preterm birth (delivery prior to 37 completed gestational weeks) and low birth weight (birth weight below 2500 grams), as these factors can greatly impact both short-term and long-term pediatric health. Although birth weight is only measured after birth, it captures the infant's weight prior to birth and is not affected by the treatment (delivery mode), and can therefore be used as an exclusion criterion in the target trial framework. While clinicians, supported by guidelines [55], generally recommend a trial of vaginal delivery after one CD, it is generally not offered after two or more CDs. Therefore, women with a history of 2 or more CDs were also excluded. Fig 1 describes eligibility criteria and the flow chart of cohort selection, and Table 2 summarizes baseline characteristics of the 238,159 eligible children. In the UK cohort, 625,044 births, from 1994 until 2019, were analyzed. Of these, 250,269 births could be linked to the child's medical record. Table S2.1 in S1 Appendix summarizes baseline characteristics of the 163,272 eligible children from this cohort.
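The eligibility criteria described above can be read as a simple per-record filter. The following is a minimal sketch under stated assumptions: the `Birth` record and all of its field names are hypothetical and are not taken from the Clalit or IMRD schemas, and only the criteria named in this section are encoded.

```python
from dataclasses import dataclass


@dataclass
class Birth:
    # All field names are hypothetical, chosen for illustration only.
    has_mother_child_link: bool
    maternal_history_years: float  # documented maternal EHR history before delivery
    gestational_weeks: float       # completed gestational weeks at delivery
    birth_weight_g: float
    previous_cd_count: int


def is_eligible(b: Birth) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    return (
        b.has_mother_child_link
        and b.maternal_history_years >= 5  # at least 5 years of maternal history
        and b.gestational_weeks >= 37      # exclude preterm births
        and b.birth_weight_g >= 2500       # exclude low birth weight
        and b.previous_cd_count < 2        # exclude 2 or more previous CDs
    )


births = [
    Birth(True, 6.0, 39, 3200, 0),  # eligible
    Birth(True, 6.0, 36, 3000, 0),  # preterm
    Birth(True, 2.0, 40, 3400, 1),  # too little maternal history
    Birth(True, 8.0, 38, 2600, 2),  # two previous CDs
]
eligible = [b for b in births if is_eligible(b)]
```

Because every criterion is known at or before time zero (birth), applying the filter does not condition on anything that happens after treatment assignment.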

Statistical analysis
Directed Acyclic Graphs (DAGs) [56,57], were constructed together with physicians expert in obstetrics and pediatrics (Fig S8.1 in S1 Appendix). We trained a propensity model [58] estimating the probability of being treated, i.e. giving birth by CD, using variables selected using the DAG described above. For learning the propensity model with a large number of covariates and to allow nonlinearities, we trained Gradient Boosting trees [59]. Evaluation of the propensity model and covariate balance was done in a similar manner to the workflow described by Shimoni, Y. et al. [60]. In Fig 4A the distributions of propensities for each treatment group are plotted, and overlap between delivery modes is observed at least up to a score of 0.4. Covariate balance before and after reweighting is presented in Fig 4B (see Section 5 in the S1 Appendix). Applying a feature attribution framework for machine learning models based on estimated Shapley values [61], we were able to estimate the contribution of the baseline covariates to the estimated propensity. This setup allowed us to capture any non-linear relationships between a covariate's contribution and the prediction value. For example, we observed a well-known [62] non-linear U-shaped impact of birth weight on the propensity score ( Fig 4C). Another variable that had a non-linear impact on the propensity model was the time of day at delivery. It is known that elective CDs are scheduled for daytime working hours (8am-4pm), while vaginal deliveries are expected to distribute uniformly throughout the entire day (Fig S3.3 in S1 Appendix and Fig 4D). Utilizing the additive property of Shapley values, we were also able to analyze groups of related features according to domain knowledge (Fig S5.3 in S1 Appendix).
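To make the idea of covariate balance before and after reweighting concrete, the sketch below computes an absolute standardized mean difference (SMD) for one covariate under no weighting, IPW and OW. This is an illustration only: the SMD is a commonly used balance metric and is assumed here rather than taken from the study, the propensity scores are fixed by hand instead of coming from a gradient boosting model, and all function names are hypothetical.

```python
import math


def ipw_weight(treated, e):
    """Non-stabilized inverse-probability weight for propensity score e."""
    return 1.0 / e if treated else 1.0 / (1.0 - e)


def overlap_weight(treated, e):
    """Overlap weight: each child is weighted by the probability of the other arm."""
    return 1.0 - e if treated else e


def weighted_mean(x, w):
    return sum(xi * wi for xi, wi in zip(x, w)) / sum(w)


def weighted_var(x, w):
    m = weighted_mean(x, w)
    return sum(wi * (xi - m) ** 2 for xi, wi in zip(x, w)) / sum(w)


def abs_smd(x, treated, w):
    """Absolute standardized mean difference of covariate x between delivery modes."""
    x1 = [xi for xi, t in zip(x, treated) if t]
    w1 = [wi for wi, t in zip(w, treated) if t]
    x0 = [xi for xi, t in zip(x, treated) if not t]
    w0 = [wi for wi, t in zip(w, treated) if not t]
    pooled_sd = math.sqrt((weighted_var(x1, w1) + weighted_var(x0, w0)) / 2.0)
    return abs(weighted_mean(x1, w1) - weighted_mean(x0, w0)) / pooled_sd


# Toy data: a binary covariate that fully determines the propensity score.
x = [1, 1, 1, 1, 0, 0, 0, 0]
treated = [True, True, True, False, True, False, False, False]
e = [0.75 if xi == 1 else 0.25 for xi in x]

unweighted = [1.0] * len(x)
ipw = [ipw_weight(t, ei) for t, ei in zip(treated, e)]
ow = [overlap_weight(t, ei) for t, ei in zip(treated, e)]

smd_before = abs_smd(x, treated, unweighted)  # large imbalance
smd_ipw = abs_smd(x, treated, ipw)            # balanced after IPW
smd_ow = abs_smd(x, treated, ow)              # balanced after OW
```

In this toy example both weighting schemes remove the imbalance entirely because the propensity scores are exactly correct; with an estimated propensity model the residual imbalance is what a balance plot would show.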

Outcomes
Long-term childhood outcomes of CD were first identified from previous studies that demonstrated associations between these outcomes and the mode of delivery [1,7]. Each outcome was then defined and ascertained separately by a trained pediatrician. The diagnoses were based on the relevant ICD-9 codes, laboratory test results, and prescriptions of medications for the relevant medical conditions. When feasible, the diagnoses were based on previously published diagnostic approaches from EHR available in the literature or on the https://phekb.org/phenotypes website, and in accordance with the Israeli healthcare policies and definitions (see section 4 in the S1 Appendix). Only outcomes with at least 10 diagnosed children who were born by each delivery mode were included in the main analysis, and the rest are presented in section 6 of the S1 Appendix. Censoring was used in case of death or if the child did not register to Clalit HMO until the age of 3 months.

Estimation of delivery mode effect
For each of the described outcomes, we estimated the causal effect with 3 different strategies: (1) weighting by standard (non-stabilized) inverse-probability-weighting (IPW) [18], (2) weighting by overlap-weights (OW) [19], and (3) standardization [17]. The outcomes can be categorized into 2 types: (I) time-to-event outcomes (such as asthma onset), for which survival curves were estimated; and (II) fixed continuous outcomes (such as BMI z-score), for which average treatment effect (ATE) differences were estimated. For time-to-event outcomes, survival curves were constructed by fitting a pooled logistic regression model [15,16] on a person-time data format. Time resolution in the person-time format was 4 months, and the logistic model was fitted with a time-varying intercept and product terms between time and treatment [63]. To adjust for baseline confounders, we either added selected baseline variables to the logistic model (standardization), or performed weighting using either IPW or OW. Both weighting methods were evaluated when constructing the propensity model; OW resulted in a weighted population with better covariate balance (Fig S5.5 in S1 Appendix). 95% CIs were estimated by bootstrap sampling with 100 iterations.
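The person-time format and the step from discrete-time hazards to a survival curve can be sketched as follows. This is a minimal illustration, not the study's implementation: the function names are hypothetical, and the per-interval hazards are passed in directly, whereas in the analysis they would be predicted by the fitted pooled logistic regression model.

```python
import math


def to_person_time(follow_up_months, had_event, interval_months=4):
    """Expand one child's follow-up into (interval index, event indicator) rows.

    The event indicator is 1 only in the last interval, and only if the child
    was diagnosed rather than censored.
    """
    n_intervals = max(1, math.ceil(follow_up_months / interval_months))
    rows = [(k, 0) for k in range(n_intervals)]
    if had_event:
        rows[-1] = (n_intervals - 1, 1)
    return rows


def survival_from_hazards(hazards):
    """S(t_k) is the running product of (1 - h_j) over intervals j <= k."""
    curve, s = [], 1.0
    for h in hazards:
        s *= 1.0 - h
        curve.append(s)
    return curve


def standardized_risk(hazards_per_child):
    """Average each child's cumulative risk, 1 - S(t_end), over the cohort."""
    risks = [1.0 - survival_from_hazards(h)[-1] for h in hazards_per_child]
    return sum(risks) / len(risks)


# A child diagnosed at 10 months contributes three 4-month rows.
rows = to_person_time(10, had_event=True)

# Hypothetical predicted hazards for two children under each delivery mode.
hazards_cd = [[0.02, 0.02, 0.02], [0.01, 0.01, 0.01]]
hazards_vd = [[0.01, 0.01, 0.01], [0.01, 0.01, 0.01]]
risk_difference = standardized_risk(hazards_cd) - standardized_risk(hazards_vd)
```

Standardization in this sense predicts hazards for every child twice, once with treatment set to CD and once with treatment set to vaginal delivery, and contrasts the two cohort-averaged risks. Repeating the whole procedure on bootstrap resamples would give the percentile confidence intervals.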

Sensitivity analysis
As some of our assumptions cannot be verified from the data, we performed several sensitivity analyses as described below.

Negative controls
One tool to detect unmeasured confounding is the use of negative controls [64]. Here, upper forearm fracture (upper end of radius and ulna), a relatively common diagnosis in children, was chosen as negative control since no studies thus far indicated that there is a plausible association between this diagnosis and CD.

Elective CD subpopulation
In the UK cohort, data on whether CD was unscheduled or elective was available, allowing us to create an elective CD subpopulation, which includes 141,222 children. Baseline characteristics of this subpopulation can be found in Table S6.6 in S1 Appendix. In the Israel data, a clear distinction between unscheduled and elective CD was not available. To estimate the probability of elective CD in this cohort, we utilized an additional data set, in which the type of CD was specified, from in-hospital electronic records, obtained from Rabin Medical Center, the third largest medical center in Israel. The data set contained information on 56,260 births between 2012-2020, of which 3,827 were by unscheduled CD and 3,972 by elective CD. Using this additional data, we built a model that predicts elective CD from variables that are also present in the original Clalit EHR database (see section 7 in the S1 Appendix). We used this trained model to obtain predictions for elective CD in our cohort, and created a subpopulation which contained vaginal deliveries from our full study population and CDs which were predicted as elective with a high probability (greater or equal to 0.5, selected according to the distribution of predicted probabilities) (Fig S7.2 in S1 Appendix).

Clinics matched subpopulation
Matching by clinics can assist in minimizing confounders such as socioeconomic status that may differ between geographic locations. In this subpopulation, children were grouped by maternal clinics (defined as the most frequently visited clinic prior to pregnancy) and year of birth. In each group we sampled an equal number of CDs and vaginal deliveries.
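The grouping and equal-size sampling step can be sketched as below. This is a minimal illustration under assumptions: the record keys (`clinic`, `birth_year`, `cd`) are hypothetical, and sampling without replacement down to the smaller arm is one plausible reading of "an equal number" rather than a confirmed detail of the study.

```python
import random
from collections import defaultdict


def match_by_clinic_and_year(children, seed=0):
    """Sample equal numbers of CD and vaginal births within each (clinic, year) group."""
    rng = random.Random(seed)
    groups = defaultdict(lambda: {True: [], False: []})
    for child in children:
        groups[(child["clinic"], child["birth_year"])][child["cd"]].append(child)

    matched = []
    for arms in groups.values():
        # The smaller arm sets the sample size; groups missing one arm drop out.
        n = min(len(arms[True]), len(arms[False]))
        matched += rng.sample(arms[True], n) + rng.sample(arms[False], n)
    return matched


children = (
    [{"clinic": "A", "birth_year": 2010, "cd": True}] * 2
    + [{"clinic": "A", "birth_year": 2010, "cd": False}] * 5
    + [{"clinic": "B", "birth_year": 2011, "cd": False}] * 3
)
matched = match_by_clinic_and_year(children)
```

Groups with no births in one of the arms contribute nothing, which is one reason such a matched subpopulation is much smaller than the full cohort.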

Siblings matched subpopulation
To create a subgroup of siblings, we made use of the longitudinal, large-scale family data available in the Clalit EHRs and built a sensitivity analysis based on a subpopulation of discordant siblings: pairs of same-sex siblings born by different delivery modes. Birth order was matched by including equal numbers of pairs in which the older sibling was born by CD and pairs in which the older sibling was born by vaginal delivery.
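One possible reading of this construction is sketched below. Several details are assumptions made for illustration and are not stated in the text: pairs are formed from consecutive siblings only, birth-order balance is achieved by sampling the larger group of pairs down to the size of the smaller one, and the `Child` fields are hypothetical.

```python
import random
from collections import defaultdict, namedtuple

# Hypothetical record layout, for illustration only.
Child = namedtuple("Child", "family birth_order sex cd")


def discordant_pairs(children):
    """Return (older, younger) pairs of same-sex siblings with different delivery modes."""
    by_family = defaultdict(list)
    for c in children:
        by_family[c.family].append(c)

    pairs = []
    for sibs in by_family.values():
        sibs.sort(key=lambda c: c.birth_order)
        for older, younger in zip(sibs, sibs[1:]):  # consecutive siblings (assumption)
            if older.sex == younger.sex and older.cd != younger.cd:
                pairs.append((older, younger))
    return pairs


def balance_birth_order(pairs, seed=0):
    """Keep equal numbers of CD-first and vaginal-first pairs."""
    rng = random.Random(seed)
    cd_first = [p for p in pairs if p[0].cd]
    vd_first = [p for p in pairs if not p[0].cd]
    n = min(len(cd_first), len(vd_first))
    return rng.sample(cd_first, n) + rng.sample(vd_first, n)


children = [
    Child("f1", 1, "F", True), Child("f1", 2, "F", False),   # CD first
    Child("f2", 1, "M", False), Child("f2", 2, "M", True),   # vaginal first
    Child("f3", 1, "M", False), Child("f3", 2, "M", True),   # vaginal first
    Child("f4", 1, "F", True), Child("f4", 2, "M", False),   # different sex: dropped
    Child("f5", 1, "F", False), Child("f5", 2, "F", False),  # concordant: dropped
]
pairs = discordant_pairs(children)
balanced = balance_birth_order(pairs)
```

Balancing the two pair types keeps birth order from being aligned with delivery mode, since otherwise first-born children could be over-represented in one arm.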

Subpopulation of women with no history of a previous CD
Using the medical history data available in Clalit, we excluded any women who had a history of delivery by CD. A total of 220,041 children born to mothers with no CD history were included in this subpopulation.